Selective Term Proximity Scoring Via BP-ANN
Authors
Abstract
When two terms occur together in a document, a close relationship between them and the document itself is more probable if they occur in nearby positions. However, ranking functions that incorporate term proximity (TP) require larger indexes than traditional document-level indexing, which slows down query processing. Previous studies also show that this technique is not effective for all types of queries. Here we propose a document ranking model that decides, based on a collection of query features, for which queries proximity-based ranking would be beneficial. We use a machine learning approach to make this decision. Experiments show that the proposed model returns improved rankings while also reducing the overhead incurred by using TP statistics.
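The selective idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the feature vector, the network weights, the function names (`use_proximity`, `min_pair_distance`, `score`), and the inverse-square proximity bonus are all hypothetical. A small feed-forward network with fixed weights stands in for a trained BP-ANN, and its output gates whether a proximity bonus is added to a base document score.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def use_proximity(features, hidden_weights, hidden_biases,
                  output_weights, output_bias, threshold=0.5):
    """Forward pass of a one-hidden-layer network (hypothetical weights).

    Returns True when the predicted benefit of TP scoring for this
    query reaches the threshold.
    """
    hidden = [
        sigmoid(sum(w * f for w, f in zip(row, features)) + b)
        for row, b in zip(hidden_weights, hidden_biases)
    ]
    p = sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)
    return p >= threshold


def min_pair_distance(pos_a, pos_b):
    """Smallest absolute position difference between two terms.

    Assumes both position lists are non-empty.
    """
    return min(abs(a - b) for a in pos_a for b in pos_b)


def score(base_score, term_positions, tp_enabled, tp_weight=1.0):
    """Base score, plus a proximity bonus only when the gate enabled TP.

    term_positions maps each query term to its positions in the document.
    The inverse-square bonus is an illustrative choice, not the paper's.
    """
    terms = list(term_positions)
    if not tp_enabled or len(terms) < 2:
        return base_score
    bonus = 0.0
    for i in range(len(terms)):
        for j in range(i + 1, len(terms)):
            d = max(1, min_pair_distance(term_positions[terms[i]],
                                         term_positions[terms[j]]))
            bonus += 1.0 / (d * d)
    return base_score + tp_weight * bonus
```

With this layout, positional data only needs to be consulted for queries where the gate returns True, which is where the reduction in overhead would come from under these assumptions.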
Similar resources
A Short Note on Proximity-based Scoring of Documents with Multiple Fields
The BM25 ranking function is one of the most well-known query relevance document scoring functions, and many variations of it have been proposed. The BM25F function is one of its adaptations, designed for modeling documents with multiple fields. The Expanded Span method extends a BM25-like function by taking into consideration the proximity between term occurrences. In this note, we combine these two var...
Efficient Text Proximity Search
In addition to purely occurrence-based relevance models, term proximity has been frequently used to enhance retrieval quality of keyword-oriented retrieval systems. While there have been approaches on effective scoring functions that incorporate proximity, there has not been much work on algorithms or access methods for their efficient evaluation. This paper presents an efficient evaluation fra...
Term Proximity Scoring for Keyword-Based Retrieval Systems
This paper suggests the use of proximity measurement in combination with the Okapi probabilistic model. First, using the Okapi system, our investigation was carried out in a distributed retrieval framework to calculate the same relevance score as that achieved by a single centralized index. Second, by applying a term-proximity scoring heuristic to the top documents returned by a keyword-based s...
Modelling of Conventional and Severe Shot Peening Influence on Properties of High Carbon Steel via Artificial Neural Network
Shot peening (SP), one of the severe plastic deformation (SPD) methods, is employed for surface modification of engineering components by improving their metallurgical and mechanical properties. Furthermore, artificial neural networks (ANN) have been widely used in the last decade for prediction and optimization in different science and engineering problems. In the present study, effects of conv...
Heading-Aware Proximity Measure and Its Application to Web Search
Proximity of query keyword occurrences is an important piece of evidence for effective query-biased document scoring. If a query keyword occurs close to another in a document, it suggests high relevance of the document to the query. The simplest way to measure proximity between keyword occurrences is to use the distance between them, i.e., the difference of their positions. However, most web pag...
Journal title:
- CoRR
Volume abs/1606.07188 Issue
Pages -
Publication date 2016